Abstract

Tizen is a new Linux-based open source platform for consumer devices including smartphones, televisions, vehicles, and wearables. While Tizen provides kernel-level mandatory policy enforcement, it has a large collection of libraries, implemented in a mix of C and C++, which make their own security checks. In this research, we describe the design and engineering of a static analysis engine which drives a full information flow analysis for apps and a control flow analysis for the full library stack. We implemented these static analyses as extensions to LLVM, requiring us to improve LLVM’s native analysis features to get greater precision and scalability, including knotty issues like the coexistence of C++ inheritance with C function pointer use. With our tools, we found several unexpected behaviors in the Tizen system, including paths through the system libraries that did not have inline security checks. We show how our tools can help the Tizen app store to verify important app properties as well as helping the Tizen development process avoid the accidental introduction of subtle vulnerabilities.

1 Introduction 

Static analysis has proven to be wildly successful in finding all sorts of bugs, whether related to security or other flaws, so the availability of a new system to analyze for bugs is an interesting opportunity to see how good these tools can be. To that end, we had the opportunity to design and implement static analyses for Tizen, a new operating system platform that will soon run on a variety of Samsung products including televisions, wearables, automobile telematics systems, and smartphones. This paper describes the analysis challenges presented by the Tizen platform, as distinct from competing platforms like Android, along with the tools we developed and the issues we found.


We’ll describe the Tizen architecture in more detail later, but at a high level Tizen is a variant of Linux, with kernel-enforced mandatory access control rules. Applications can be built entirely from HTML5 web primitives (JavaScript, etc.), much as was done in Palm’s WebOS, or they can be built natively, using a variety of C and C++ standard libraries. Tizen has a series of permissions that can be granted to applications in a fashion similar to Android, which are then enforced both at a low level, using the kernel, and through higher-level checks embedded in the libraries. Native apps will be distributed as LLVM bitcode, a portable, machine-independent intermediate code representation that’s naturally amenable to static analysis via the LLVM toolchain. We presume there will be a centralized Tizen app store (Samsung just opened TizenStore.com in January of this year) that can conduct analyses over Tizen apps to ensure their safety prior to being downloaded to Tizen users.* In a recent talk, Samsung’s partner, AhnLabs, described a mixed process with both static and dynamic analysis as well as human analysts [?].

In deciding what aspects of the Tizen system were interesting for security-related static analyses, we decided to focus our attention on higher-level security topics. For native Tizen apps, we concluded that it would be most helpful to have a general-purpose LLVM information flow analysis tool that could identify apps containing undesired flows, such as from the user’s contacts list to the network. We envision this automated analysis being conducted mechanically in an app store alongside a human analyst who studies the effectiveness of various source/sink pairs, amending the rules as needed. The goals of this tool are to run quickly and to produce useful evidence that can quickly allow safe apps to be approved, allowing human analysts to spend more of their time digging into suspicious apps with unusual behaviors. We prefer information flow analyses over more primitive cataloging of privileged operations, as done in the Tizen store presently [?], because we hypothesize it will result in fewer false positives. For example, if a privacy-sensitive advertising library downloaded several ad variants, selecting one for display based on how well it matches platform-local private information about the user, this would be far less concerning than leaking that same private information over the network for the decision to be made remotely. While both variants use the same permissions, information flow can distinguish the good from the bad.

* While the authors of this paper are blinded for review, we note that we do not represent Samsung, Intel, or any other commercial company involved in Tizen. All the work presented here is based on public information including Samsung’s open-source release of the Tizen codebase.

For the Tizen system libraries, written in a mix of C and C++ and containing internal security checks that make them part of the system’s sizable trusted computing base, we face a larger challenge. These libraries enforce security properties while they are simultaneously linked to the same address space as the potentially hostile apps that call them. We consequently expect that the Tizen app store will need to statically analyze apps to ensure they only branch to approved entry points in the system libraries and that they don’t exploit unsafe properties of the C language (e.g., indexing beyond the end of an array, overwriting a function pointer, and branching to a forbidden target). Such “safety” analyses are well within the province of existing commercial tools, so we didn’t implement them. Furthermore, apps built using the web stack (JavaScript, etc.) call into the very same libraries, pointing to the importance of validating these entry points’ use of security checks.

Consequently, we decided to implement a control flow analysis over the native libraries in order to discover whether there are paths through the libraries that are missing security checks, and thus might indicate exploitable flaws that such a “safety” analysis in the app store might otherwise approve. Unlike our information flow analysis for Tizen apps, we envision this Tizen library analysis to be something that can run for hours, if not days, in the service of Tizen system developers’ internal bug finding. Likewise, we envision that Tizen system developers would be able to add trusted code annotations to inform this analysis, although it’s essential that such annotations be few and far between, in order to minimize friction to the adoption of our tool.

The rest of this paper describes Tizen in more detail (Section 2), then presents our LLVM-based static analysis engine (Section 3). We follow with our analysis of Tizen apps (Section 4) and API libraries. We then discuss pragmatic issues and wrap up with prior work and conclusions.


2 Background 

The Tizen platform [55] is an operating system based on the Linux kernel and the GNU standard C library. It includes a graphics layer based on the Enlightenment Foundation Libraries and the X Window System.

Tizen already runs on smartphones [?], wearables such as watches [46], cameras [48], vehicle infotainment systems [?], TVs [49] and in the future refrigerators, air conditioners and washing machines [?]. Consequently, its security properties become quite important. The Tizen libraries are implemented as a C++ layer of programmer-accessible APIs built on top of a C layer of APIs that are deliberately hidden from the application programmers. The intention is that application programmers won’t deal, for example, with the X Window System, but rather will use Tizen’s official graphics APIs.

2.1 Tizen Applications 

Applications can be either HTML5-based or native. This paper focuses on security analysis of the native applications, which use the C standard library and additional Tizen APIs that offer access to phone calling and contacts, SMS, networking, Bluetooth, and other services, as shown in Figure 1.

The availability and wide range of these APIs make Tizen a unique target for analysis, since the entirety of the kernel, standard libraries, standardized application platform and the applications themselves are written in C/C++ and compiled to native code^ and are, in effect, all within the trusted computing base of the platform.

2.2 Tizen Privileges 

The main mechanism for enforcing privacy and security for applications is a system of privileges, functionally similar to that of Android. The application privileges are displayed to the user ahead of installation, with applications only being downloaded and installed once the user accepts the privileges that the application requires.

From a security standpoint, the use of C/C++ for the Tizen libraries (widely known as difficult to analyze with its use of function pointers, aliased arrays and deep class hierarchies) together with the existence of rich application APIs, each with their own associated permissions, makes determining the correctness of Tizen’s privilege system a serious challenge. Even for Android, where privileges are enforced outside of a potentially hostile application’s address space, researchers have discovered multiple permissions inconsistencies inside the OS libraries [?] and several different types of permission misconfiguration [?, ?], leading to application over-privilege [59] and increased application vulnerability [?].

^The distribution format for native apps will actually be LLVM’s bitcode IR. This intermediate representation, much like Android’s Dalvik, will be compiled at install-time to the platform’s native CPU architecture.

Figure 1: Tizen application development stack. In this paper, we focus on the native applications stack.

While, to the best of our knowledge, Tizen does not have a security document explaining the rules of privilege enforcement, by analyzing the code, we observed the following rules.

• As a first layer of defense, applications are checked for security vulnerabilities before their inclusion in the web store.

• Second, an access controller invoked by each privileged API denies access to the native APIs for which an application does not have the privilege. This is done by including a call to CheckPrivilege(privilege_name).

• Third, since checks done in the application process may be avoided by an attacker, protected actions are performed or information is retrieved from other service processes, which perform their own checking for permissions.

• At the bottom level, the inter-process communication and data access is protected by a kernel-level security module (SMACK), described below.

On its surface, this appears to be an example of defense in depth, i.e., perhaps the higher-layer checks are unnecessary and SMACK can carry all the security burden, but we hypothesize that the checks at each layer are necessary, as higher-level API semantics may be lost when control flow reaches the system-call boundary. SMACK may not have adequate context to make every security decision correctly on its own.

2.3 SMACK 

Simplified Mandatory Access Control Kernel (SMACK) 
is a Linux kernel module and associated utilities that 
allow setting custom mandatory access control (MAC) 
rules to protect data and limit process interaction. 

The combination of mandatory access control policies and API privileges for more fine-grained permissions is the standard combination of protection mechanisms in Android, which has its own permissions API and system-level enforcement. Recent versions of Android also include SELinux, which can enforce policies similar to SMACK.

SMACK relies on labeling system objects and then applying rules, based on those labels, to allow or prevent access. Its rules format is subject-label object-label access, where subject-label is the SMACK label of the task, object-label is the SMACK label of the object being accessed, and access is a string specifying the type of access allowed. We note that the SELinux policy for Linux 2.4.19 consists of over 50,000 policy statements, including over 700 subject types and 100,000 permission assignments [?]. While Tizen’s SMACK is simpler than SELinux, Tizen 2.1 has 41,000 lines of SMACK access rules [52]. It’s manifestly unclear whether these rules are “correct” or how to even define correctness over them.
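To make the three-field rule format concrete, the following is a minimal sketch of parsing and consulting such rules in user space. It is an illustration only, not SMACK's kernel implementation: the names SmackRule, parseRule and isAllowed are hypothetical, the access string is assumed to be a set of single-letter modes such as r and w, and SMACK's built-in labels and default rules are ignored.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One rule in the "subject-label object-label access" format.
struct SmackRule {
    std::string subject;
    std::string object;
    std::string access;  // assumed: letters naming the permitted access types
};

// Parse a single rule line; returns false if it lacks three fields.
bool parseRule(const std::string& line, SmackRule& out) {
    std::istringstream in(line);
    return static_cast<bool>(in >> out.subject >> out.object >> out.access);
}

// Hypothetical helper: does some rule grant `mode` to this subject/object pair?
bool isAllowed(const std::vector<SmackRule>& rules,
               const std::string& subject, const std::string& object,
               char mode) {
    for (const SmackRule& r : rules) {
        if (r.subject == subject && r.object == object &&
            r.access.find(mode) != std::string::npos) {
            return true;
        }
    }
    return false;  // no matching rule in this sketch means "deny"
}
```

Even in this toy form, answering "may label X reach label Y" requires scanning the whole rule set, which hints at why reasoning about tens of thousands of rule lines is hard.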

3 Static Analysis Engine

The motivation for this work is to identify security bugs in a C/C++ code base through static analysis. The code base could be a mobile application (i.e., a Tizen app) or an operating system (i.e., Tizen). We built our analysis infrastructure on top of the LLVM framework. Figure 2 shows the basic flow of our analysis system. The C/C++ code is compiled and translated to LLVM bitcode by the Clang [39] compiler. The bitcode is input to the LLVM-based analysis engine, which performs various information flow-based analyses to identify security bugs. The analysis is driven by user-specified analysis rules, e.g. pairs of taint source and taint sink functions.

Figure 2: The Basic Workflow

This section will describe the analysis engine, including the basic components, workflow and the mechanism used to identify two types of security bugs. The LLVM-based analysis engine applies static analysis to the input bitcode and identifies security bugs, i.e. flows that violate the analysis rules. Section 3.1 gives an overview of the software architecture of the analysis engine. Section 3.2 describes the static analysis techniques used in this engine and the interactions among them. The last two sections describe how the analysis finds privilege errors and taint pairs in two different kinds of code bases: i.e. Tizen applications and the Tizen operating system.

3.1 Structure of Analysis Engine

Figure 3 shows the structure of our analysis engine, which is built on the LLVM framework (the bold boxes indicate components that we have added). The engine takes LLVM bitcode as input and translates it into an in-memory LLVM intermediate representation (a three-address static-single assignment based IR). A client analysis is a static information flow analysis (SIFA) that runs on the LLVM IR and identifies security bugs. To assist the client analysis, a series of auxiliary analyses are invoked to create additional in-memory information, including the heap static-single assignment (HSSA) form (more detail is provided in Section 3.2), class hierarchy information, class type information and the call graph. The “refined in-memory LLVM IR” is the in-memory LLVM IR augmented by this additional information.

Figure 3: The Internal Workflow for LLVM based Analysis Engine

Here we summarize the functionality of the auxiliary analyses & transformations, and the interactions between them:

• Class Hierarchy Analysis (CHA): builds the class hierarchy graph for C++ code;

• HSSA builder: constructs the HSSA form;

• Pointer Analysis (PTA): intra-procedural pointer analysis;

• Global Value Numbering (GVN): global value numbering based on PTA;

• Class Type Analysis (CTA): this is a flow-sensitive class type analysis that is based on CHA, HSSA, PTA and GVN;

• Call Graph Builder (CG): the call graph construction based on CTA which can precisely identify the invoked virtual function calls, including function pointer invocations.

3.2 Basic Techniques 

This section gives a more detailed description of the functionality of the auxiliary analyses and transformations. The pointer analysis (PTA) and global value numbering (GVN) are standard LLVM analysis modules. The pointer analysis is an intra-procedural stateless analysis that uses allocation sites to distinguish memory addresses. The global value numbering uses alias information produced by pointer analysis to number the heap variables that have distinct values.


3.2.1 Class Type Analysis 

In C++ code, the analysis needs to identify a minimal set of possible class types in the presence of class inheritance. This helps the call graph builder to precisely identify the target of virtual function calls. The first step of class type analysis (CTA) is class hierarchy analysis (CHA), which examines the class information to build the tree structure that represents the C++ class hierarchy. Figure 4(a) gives a simple class hierarchy example, where classes B and C are subclasses of class A. The class hierarchy tree is presented in Figure 4(b).

The next step of CTA is to start from class instantiation sites and propagate class type information via variables’ def-use chains. LLVM provides scalar variable-based SSA form to represent def-use information for scalar variables. For heap variables, CTA needs assistance from pointer analysis. In Figure 4(c), the example code presents a case where pointer analysis information can disambiguate class types. In Line 5, the value of variable m is loaded from p->x, which can be an instance of class B or C. To identify the type of variable m, we need to know if variables p and q are aliased or not. If from pointer analysis we know that p and q cannot be aliased, then m’s class type is B, and the invoked function foo in Line 6 is B::foo. Otherwise, both B::foo and C::foo may be invoked at Line 6.

We now describe an interprocedural, flow- and field-sensitive class type analysis that starts from class instantiation sites and propagates class type information via variables’ def-use chains. The def-use information is built upon both scalar SSA (for scalar variables) and HSSA (for heap variables, see more details in the next section). For each scalar variable defined, all of its uses are checked and their class types are updated. If the use is a merge φ function, a meet update operation is performed, i.e. merging the class type into the merge φ function’s class type set. For each heap variable defined, all of its may-alias uses are checked and their class types are updated (i.e. merging the class type into the heap variable’s class type set). The operation of the heap variable’s merge φ is the same as for scalar variables.
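The propagation just described can be sketched as a small worklist algorithm. This is a simplified illustration rather than the engine's implementation: DefUseGraph, ClassSet and propagateClassTypes are hypothetical names, SSA and HSSA names are plain strings, and may-alias edges are assumed to have already been added to the def-use graph.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

using ClassSet = std::set<std::string>;

// Def-use graph over SSA/HSSA names: uses[d] lists the names that use d.
struct DefUseGraph {
    std::map<std::string, std::vector<std::string>> uses;
};

// Worklist propagation: seeds come from class instantiation sites; every use
// (including merge nodes) accumulates the union of the class types reaching it.
std::map<std::string, ClassSet> propagateClassTypes(
        const DefUseGraph& g, const std::map<std::string, ClassSet>& seeds) {
    std::map<std::string, ClassSet> types = seeds;
    std::queue<std::string> work;
    for (const auto& seed : seeds) work.push(seed.first);
    while (!work.empty()) {
        const std::string def = work.front();
        work.pop();
        auto it = g.uses.find(def);
        if (it == g.uses.end()) continue;
        const ClassSet src = types[def];  // copy, so a self-edge is harmless
        for (const std::string& use : it->second) {
            ClassSet& dst = types[use];
            const auto before = dst.size();
            dst.insert(src.begin(), src.end());        // the "meet" is set union
            if (dst.size() != before) work.push(use);  // re-visit only on change
        }
    }
    return types;
}
```

Because class type sets only grow and are bounded by the number of classes, the loop terminates.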


3.2.2 Heap Static-Single Assignment Form

Information flow analysis discovers the flow of values between variables in a given application. The variables can be scalar or heap variables. Heap SSA (HSSA) form [?] is used to represent the definitions and uses of heap variables, i.e. class/struct field and array accesses in the C/C++ context. For each heap variable definition and use, a pseudo-variable H_i is used to annotate the heap variable access, where a dφ function is used for definitions and a uφ function for uses. The dφ and uφ functions take the heap address (e.g., p) and offset (e.g., the offset of struct Info’s field x) as input parameters that represent the heap position. Similar to scalar SSA, a merge φ node is used to merge dφ or uφ nodes where control flow edges join. Figure 4(d) shows the transformed HSSA form from Figure 4(c). Two dφ functions (i.e., H1 and H2) are added to heap definitions at Lines 1 and 3, one uφ (i.e., H4) is added to a heap use at Line 4, and a merge φ node is used to merge H1 and H2.

Recall from the CTA algorithm that class type information can be propagated via HSSA def-use chains. At Line 1, H1 is assigned class type B. H2 is assigned class type C. H1 propagates its class type information through HSSA def-use chains, and reaches H4 as a use, since H1 and H4 are must-aliases. So H4 takes on class type B. For variable m at Line 4, its class types depend on the type of p and q, since the definition of H4 comes from H3, which merges H1 and H2. If p and q may alias, then m takes class types B and C. If p and q must not alias, then m takes class type B only. Building the HSSA form simplifies the manipulation of heap variables for analysis. The may/must alias checking gets help from pointer analysis or value numbering (i.e., the GVN in LLVM).

For function invocations, HSSA connects those call sites whose target functions have a side effect (i.e. a load or store of a heap variable). For example, a uφ function is assigned to the invocation of function foo at Line 5 in Figure 4(d), since function foo performs a load operation on heap variable B::val.

3.2.3 Call Graph Construction 

As discussed above, precise call graph construction (CG) for C++ code needs precise class type information to identify virtual function calls. Based on the CTA analysis output, the CG builder starts from entry functions. In this paper, entry functions are the main functions and event handler functions in the Tizen OS code base and mobile applications. For each indirect function invocation (i.e., invocation via a function pointer), if it is a virtual function invocation (i.e., the function pointer is loaded from a class object’s virtual table), the target object’s class type information is used to identify target functions. For a non-virtual indirect function invocation, the CG builder uses pointer analysis information to identify the target functions.

    class A {
    public:
        virtual void foo(int r);
        int val;
    };

    class B : public A {
    public:
        virtual void foo(int r);
    };

    class C : public A {
    public:
        virtual void foo(int r);
    };

a.

    [Class hierarchy tree: A at the root, with children B and C.]

b.

    typedef struct { A* x; int y; } Info;

    int main(int argc, char** argv) {
        Info *p, *q;
    1:  p->x = new B();
    2:  if (argc > 1) {
    3:      q->x = new C();
        }
    4:  A* m = p->x;
    5:  m->foo(val);  // B::foo or C::foo ?
    }

c.

    1:  p->x = new B();        H1 = dphi(H0, p, x);
    2:  if (argc > 1) {
    3:      q->x = new C();    H2 = dphi(H1, ...);
        }
                               H3 = mphi(H1, H2);
    4:  A* m = p->x;           H4 = uphi(H3, p, x);
    5:  m->foo(val);           H5 = uphi(H4);

    void B::foo(int r) {
        int x = this->val;     H1 = uphi(H0, this, val);
    }

d.

Figure 4: The Class Type Analysis Example
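The virtual-call half of call graph construction can be sketched as follows: given the set of class types a receiver may have (the CTA output) and the class hierarchy (the CHA output), each candidate type is dispatched to the nearest class that defines the method. This is a hedged illustration only; ClassInfo, Hierarchy, dispatch and virtualTargets are hypothetical names, and single inheritance is assumed.

```cpp
#include <map>
#include <set>
#include <string>

// Per-class information: its parent and the methods it defines itself.
struct ClassInfo {
    std::string parent;               // empty for a root class
    std::set<std::string> overrides;  // method names defined in this class
};
using Hierarchy = std::map<std::string, ClassInfo>;

// Walk up the hierarchy to the nearest class that defines `method`.
std::string dispatch(const Hierarchy& h, std::string cls,
                     const std::string& method) {
    while (!cls.empty()) {
        auto it = h.find(cls);
        if (it == h.end()) break;
        if (it->second.overrides.count(method)) return cls + "::" + method;
        cls = it->second.parent;
    }
    return "";  // unresolved
}

// Call graph targets of a virtual call: one per possible receiver class type.
std::set<std::string> virtualTargets(const Hierarchy& h,
                                     const std::set<std::string>& receiverTypes,
                                     const std::string& method) {
    std::set<std::string> targets;
    for (const std::string& cls : receiverTypes) {
        const std::string t = dispatch(h, cls, method);
        if (!t.empty()) targets.insert(t);
    }
    return targets;
}
```

With the hierarchy of Figure 4, a receiver type set of {B} yields the single target B::foo, while {B, C} yields both B::foo and C::foo, so a tighter class type set directly gives a smaller call graph.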


3.3 Handling C/C++ Features 

C/C++ has features that pose difficulties for static analysis, such as the coexistence of C++ inheritance with C function pointer use, the coexistence of classes/structs, and array element accesses in the form of offsets from pointers. SIFA’s call graph construction, as described above, integrates the handling of invocations through function pointers with the handling of virtual function calls. SIFA extends Heap SSA (HSSA) form [?] to represent memory accesses through class/struct field accesses and arrays in a uniform way. The original work in heap SSA only supported Java objects, and was extended for C/C++ objects in this work.


4 Tizen Application Analysis

The Tizen application analysis is an interprocedural SIFA analysis that identifies pairs of taint sources and sinks for the given application code. The taint source and sink pairs are defined by user-specified rules, i.e. the taint source function as the key and a set of taint sink functions as values. Taint analysis can be used to model different security issues, such as privacy leaks and unauthorized resource access. Here we focus on privacy leaks. The analysis engine loads the user-defined taint source and sink map into memory, and analyzes the mobile application code (represented as LLVM IR) to identify taint source function invocations.

Like all data flow analyses, taint analysis defines an associated lattice and meet function. The top element of the lattice is Untainted. The bottom element of the lattice is Tainted. These are the only two lattice elements. A variable definition is assumed to be initialized to Untainted, and becomes Tainted if it is assigned to by an expression containing a tainted value. Where a definition is assigned to by a φ-function, it becomes Tainted if any of the arguments of the φ-function are tainted. Thus the meet operation for the lattice is defined as: meet(Tainted, Untainted) = Tainted. For each taint source function invocation, the tainted value is propagated through scalar SSA and HSSA def-use chains. When a taint source reaches a corresponding sink function, then a taint pair is identified and reported to output.

    x = TaintSource();
    p->y = x + 99;
    evaluate(p);

    void evaluate(Info* p) {
        int m = p->y;
        TaintSink(m);
    }

a.

    x = TaintSource();      H1 = dphi(H0)
    p->y = x + 99;          H2 = dphi(H1, p, ...
    evaluate(p);            H3 = uphi(H4, p, y);

    void evaluate(Info* p) {
        int m = p->y;       H11 = uphi(H0, p, x)
        TaintSink(m);       H12 = uphi(H11);
    }

b.

Figure 5: Example of Interprocedural Taint Analysis

Figure 5(a) shows an example where a taint source and sink are identified across procedure boundaries. The TaintSource function is invoked and produces the result value x that should be marked as tainted. By propagating through dataflow analysis, all variables in the computation reached by the tainted value are marked as tainted. In the function evaluate, the sink function TaintSink is invoked and has tainted variable m as input. Thus the taint source reaching its corresponding taint sink is identified. To illustrate the dataflow traversal, Figure 5(b) gives the HSSA version of the code, and the arrow lines show the taint lattice value propagation through the scalar and heap variable def-use chains.

We perform taint analysis in time that is linear in the size of the HSSA graph. The implementation is currently context insensitive.
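The two-element lattice and its propagation over def-use chains can be sketched as below. This is a minimal illustration under stated assumptions, not SIFA's code: Taint, meet, DefUse and propagateTaint are hypothetical names, SSA and HSSA names are plain strings, and the def-use edges are assumed to be precomputed.

```cpp
#include <map>
#include <queue>
#include <string>
#include <vector>

enum class Taint { Untainted, Tainted };  // Untainted is top, Tainted is bottom

// meet(Tainted, Untainted) = Tainted
Taint meet(Taint a, Taint b) {
    return (a == Taint::Tainted || b == Taint::Tainted) ? Taint::Tainted
                                                        : Taint::Untainted;
}

// uses[d] lists the SSA/HSSA names whose defining expression reads d.
using DefUse = std::map<std::string, std::vector<std::string>>;

// Push taint forward from the results of taint-source invocations.
std::map<std::string, Taint> propagateTaint(
        const DefUse& uses, const std::vector<std::string>& sources) {
    std::map<std::string, Taint> state;  // absent names count as Untainted
    std::queue<std::string> work;
    for (const auto& s : sources) {
        state[s] = Taint::Tainted;
        work.push(s);
    }
    while (!work.empty()) {
        const std::string d = work.front();
        work.pop();
        auto it = uses.find(d);
        if (it == uses.end()) continue;
        const Taint dv = state[d];
        for (const auto& u : it->second) {
            Taint& cur = state[u];  // value-initialized to Untainted if new
            const Taint next = meet(cur, dv);
            if (next != cur) {      // each name changes at most once
                cur = next;
                work.push(u);
            }
        }
    }
    return state;
}
```

Since a name can only move from Untainted to Tainted, each name is enqueued at most once from a change, so the work done is proportional to the number of nodes and edges, in line with the linear-time claim above. A taint pair would then be reported whenever an argument of a sink call ends up Tainted.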
4.1 Implicit Flows

Our static taint analysis, unlike existing tools [?], identifies implicit flows [?] due to control dependences between (source, sink) pairs. This is needed to ensure that a malicious program cannot sidestep the taint flow policy rules through tricky conditionals and control flows. Our method integrates control-based and dataflow propagation for taint analysis.

For each function, the analysis tracks implicit flows by identifying control predicates and the statements that are control dependent on them. A prepass inserts pseudo-uses of the control predicate for each such definition, effectively turning the control dependence relation into a dataflow relation through which the analysis engine propagates taints. If the control predicate is tainted, the taint analysis classifies all variable definitions control dependent on the predicate as tainted. To implement this, control predicates are inserted as pseudo uses in each conditional statement prior to the taint analysis.

Consider the code example in Figure 6.

    x = TaintSource();
    if (x < 0) {
        y1 = 0;    // pseudo_use(x);
    } else {
        y2 = 1;    // pseudo_use(x);
    }
    // Taint propagates from y1 and y2
    y3 = phi(y1, y2);
    TaintSink(y3);

Figure 6: Example for Pseudo-Use

In the code example, the tainting of x is propagated to y1 and y2 through the insertion of pseudo uses. Thus y3 is tainted, and so a privacy leak occurs at TaintSink(y3).

4.2 Input Rules

Taint rule specification is usually done by identifying sources and sinks at the API level, but this may lead to unnecessary loss of precision, especially for languages such as C and C++ that need to account for reference parameters, pointer parameters and inheritance. We allow for a more refined specification in which the source and sink are identified as specific parameters (including return values) of APIs. For example, it is possible that a security analyst may consider image, but not metadata, to be a taint source in an API call like GetImage(&image, &metadata). Likewise, filename, but not mode, may be considered to be a taint sink in an API call like OpenFile(filename, mode).

4.3 Callback functions

Callback functions pose an interesting challenge because they can enable “hidden” information flow via event-driven execution. Consider, for instance, the snippet of code (shown in Figure 7) in which an application uses a callback function to preview a snapshot captured by the camera device:

    class FaceTrackerForm : public
        Tizen::Media::ICameraEventListener {
    public:
        void OnCameraPreviewed(
            Tizen::Base::ByteBuffer& previewedData,
            result r) {
            ArrayList* pList = null;
            ByteBuffer* pBuffer = new (std::nothrow)
                ByteBuffer();
            pBuffer->Construct(previewedData);
            pList = new (std::nothrow) ArrayList;
            pList->Construct();
            pList->InsertAt(*pBuffer, 0);

Figure 7: Example for Event Handler

In the example above, the FaceTrackerForm class implements an interface function called OnCameraPreviewed() exposed by the Tizen::Media::ICameraEventListener class. The ICameraEventListener class is meant to provide callback functions to retrieve data in an event-driven fashion. In the example above, the OnCameraPreviewed() function retrieves the captured snapshot packaged as a (Tizen) ByteBuffer.

To handle callback functions in our taint analysis, the input rules make use of a type attribute, where for callbacks the type is set to event. These extra attributes compensate for our desire not to include the entirety of the Tizen system libraries as part of our information flow analysis of potentially hostile apps. Instead, we only need to annotate the various library entry points and their callback behaviors.
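One way to picture rules that name individual parameters and carry a type attribute is the sketch below. The paper does not show its concrete rule syntax, so everything here is an assumption for illustration: Endpoint, RuleType, TaintRule, isSource and isSink are hypothetical names, and parameters are identified by a 0-based index with -1 standing for the return value.

```cpp
#include <string>
#include <vector>

// The part of an API call that a rule endpoint refers to.
struct Endpoint {
    std::string api;  // e.g. "GetImage"
    int param;        // 0-based parameter index, or -1 for the return value
};

// Event marks a rule whose source is a parameter delivered to a callback.
enum class RuleType { Call, Event };

struct TaintRule {
    RuleType type;
    Endpoint source;
    std::vector<Endpoint> sinks;
};

bool isSource(const TaintRule& rule, const std::string& api, int param) {
    return rule.source.api == api && rule.source.param == param;
}

bool isSink(const TaintRule& rule, const std::string& api, int param) {
    for (const Endpoint& s : rule.sinks) {
        if (s.api == api && s.param == param) return true;
    }
    return false;
}
```

Under these assumptions, a rule with source {"GetImage", 0} taints image but not metadata, and a rule of type Event with source {"OnCameraPreviewed", 0} would mark the previewed data as tainted on entry to the handler, without analyzing the library code that invokes it.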

4.4 Ranking of Vulnerabilities 

The output of the taint analysis is a prioritized list of vulnerabilities. Each vulnerability rule assigns a severity level. For instance, leaking one’s location might be considered to be a lower severity than leaking one’s SMS messages. The vulnerabilities detected by taint analysis are ranked primarily according to their severity and secondarily according to the distance between source and sink in the application, under an assumption that a longer distance from source to sink represents less of a security threat. Of course, all of this is still reported to the human analyst.

Where vulnerabilities have the same severity level, their relative ranking is based on their distance metrics. A shorter distance results in a higher rank. The attributes call_distance and control_distance define a metric for the distance between the source and the sink in the application. The call_distance value is one if the path from the source to the sink includes a function call and is zero otherwise. Where the source dominates the sink in the control dependence graph, the control distance is the length of the path between them. Where the source does not dominate the sink, the control distance is the sum of the distances between each node and their least common ancestor. The control_distance value is only relevant when the value of call_distance is zero.

As in [?], the ranking of the vulnerabilities can be used to sort them so that the most likely errors appear closer to the top of the vulnerability list generated by the taint analysis. The output report can be limited to a tunable cutoff threshold (e.g., the top 100 vulnerabilities). A smaller threshold will decrease the false positive rate but increase the false negative rate.
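The ranking scheme of this section can be sketched as a sort with a composite key followed by truncation. This is an illustrative reading rather than the tool's implementation: Finding, rankKey and rankFindings are hypothetical names, and a larger severity number is assumed to mean a more severe rule.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

struct Finding {
    std::string name;
    int severity;          // assumed: larger means more severe
    int call_distance;     // 1 if the source-to-sink path crosses a call, else 0
    int control_distance;  // only meaningful when call_distance == 0
};

// Sort key: severity first (descending), then distance (ascending).
static std::tuple<int, int, int> rankKey(const Finding& f) {
    const int control = (f.call_distance == 0) ? f.control_distance : 0;
    return std::make_tuple(-f.severity, f.call_distance, control);
}

// Rank the findings and keep at most `cutoff` of them for the report.
void rankFindings(std::vector<Finding>& findings, std::size_t cutoff) {
    std::stable_sort(findings.begin(), findings.end(),
                     [](const Finding& a, const Finding& b) {
                         return rankKey(a) < rankKey(b);
                     });
    if (findings.size() > cutoff) {
        findings.erase(findings.begin() + static_cast<std::ptrdiff_t>(cutoff),
                       findings.end());
    }
}
```

Ignoring control_distance when call_distance is one keeps the comparison a consistent ordering, and the stable sort preserves the original order among findings that tie on all three components.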

4.5 Tizen Application Analysis Evaluation 

4.5.1 Tizen Application Analysis Results

We wrote a rule set for Tizen application analysis, based 
on Tizen security policies. Using our tool, we were able 
to find unexpected behavior for an application. 

We used 30 Tizen sample native applications, which
were the only applications available at the time of
this research. We created rules to detect privacy leaks and
unauthorized resource accesses involving the file system.
These are among the security vulnerabilities that
TizenStore.com would check for to ensure the safety of Tizen
apps before they are downloaded by Tizen users. In both
cases, rules consist of one taint source API and one or
more taint sink APIs. We also checked for colluding apps,
which requires support for identifying taint pairs across
IPC calls. For this case, we created two rule sets for
colluding applications: one for “producer” applications,
with information flow from SMS to IPC calls, and one
for “consumer” applications, with information flow from
IPC calls to File. The two rule sets are marked so that the
analysis engine can recognize them and apply them as a
producer/consumer pattern.
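
One way to picture the producer/consumer rule sets is sketched below. The rule representation, the role marker, and the app names are assumptions made for illustration; the actual SIFA rule format is not shown in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaintRule:
    source: str                 # one taint source API category
    sinks: frozenset            # one or more taint sink API categories
    role: Optional[str] = None  # "producer", "consumer", or None (assumed marker)


# Rule sets for colluding applications, following the text:
# producer: SMS -> IPC, consumer: IPC -> File.
PRODUCER = TaintRule("SMS", frozenset({"IPC"}), role="producer")
CONSUMER = TaintRule("IPC", frozenset({"File"}), role="consumer")


def matches(rule, flows):
    """flows: set of (source, sink) pairs reported for a single app."""
    return any(src == rule.source and snk in rule.sinks for src, snk in flows)


def colluding_pairs(app_flows):
    """Pair every app matching the producer rule with every other app
    matching the consumer rule."""
    producers = [app for app, fl in app_flows.items() if matches(PRODUCER, fl)]
    consumers = [app for app, fl in app_flows.items() if matches(CONSUMER, fl)]
    return [(p, c) for p in producers for c in consumers if p != c]


# Hypothetical per-app analysis results.
app_flows = {
    "AppA": {("SMS", "IPC")},
    "AppB": {("IPC", "File")},
    "AppC": {("Contacts", "File")},
}
print(colluding_pairs(app_flows))
```

Neither AppA nor AppB leaks SMS data to a file on its own; the pair is only reported because the two marked rule sets are evaluated together.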

Our tool identified one privacy leak in the
FriendFinder application without any false positives
or false negatives. In the FriendFinder application’s
ConnectionManager class, there is a call to
GetImagePathPtr, which retrieves the path information
as a string. In the same function, there is a call to
BluetoothOppClient::PushFile that takes the output
string of GetImagePathPtr. This induces a privacy leak
because the GetImagePathPtr API obtains a profile
picture (i.e., file name) of the user, which is then sent
to another device via the BluetoothOppClient::PushFile
API.

With a finding like this, an analyst looking at the
report might conclude that FriendFinder is operating as
expected: sending profile pictures through the Bluetooth
connection would seem to be an expected behavior for
the app. If there were a flow to the network, however,
then the analyst would have reason for concern and might
take action to ban the app.

4.5.2 Tizen Application Analysis Performance 

We ran SIFA on a quad-core Intel Xeon 2.66GHz workstation
with 8GB of memory, running Red Hat Linux
(RHEL 5). The largest application is MediaApp, which
contains 129,375 bitcode instructions. MediaApp took
the longest time to analyze: 22.82 seconds for
total execution. The analysis itself, which includes the
related LLVM analysis pre-passes, pointer analysis, and
interprocedural taint analysis, took 20.351 seconds. Our
experiment shows that the tool can consume more than
10,000 LLVM bitcode instructions per second (i.e., about
3,000 lines of C++ code per second) on average. We
also measured peak memory usage using Valgrind; the
largest memory consumption came from MediaApp,
which required 3.098GB.



Figure 8: Tizen Application Analysis Throughput 

For contrast, we note that Google’s Play Store for Android
introduces several hours of latency between when
an app is submitted and when it becomes live for production.
The CPU and time costs for performing our information
flow analysis are negligible compared to the time
a human analyst might spend understanding the results and
considering whether they are appropriate for the
app’s claimed functionality. And, of course, as the volume
of submitted apps grows, standard cluster resources
can be used to conduct concurrent analyses, independently,
with human analysts engaging after the analyses
are complete.

5 Tizen API Analysis 

Tizen API analysis (TAA) identifies paths from native 
API calls to low-level system (Linux) kernel calls to test 
for potential violations of user privileges. It performs 
a dataflow analysis on top of the call graph to identify 
information flow. Here the propagated information is the 
set of user privileges exercised along call paths. 

The user-specified privilege rules are inputs to the 
analysis, defined as: 

1. A set of (source, sink) pairs, where each source is a 
native API call and each sink is a glibc call, which 
is a wrapper for a kernel call; 

2. A set of user privilege properties (UPVS) that call
paths from the source to the sink are allowed to
exercise, for each (source, sink) pair.

TAA traverses the call graph in a top-down manner
from each entry function (the call graph here is a forest),
and starts a new call path trace when a source call
is identified. An entry function here is an event handler
function in the Tizen OS code base. The call path trace is
performed on the call graph by means of HSSA. For each
path in the library code base, TAA collects the set of
user privileges (PVS) exercised along the call path and
stores the call path in a candidate list when a sink call
is identified. Privileges are identified from calls to the
CheckUserPrivilege function, but the user can also specify
other special function calls for identifying user privileges. A
call path is a potential violation of user privilege properties
iff its PVS contains an element that is not in UPVS
(i.e., PVS is not a subset of UPVS).
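
The traversal and the subset test can be illustrated with a heavily simplified sketch. It assumes straight-line functions represented as ordered lists of events and does not model HSSA, branching, or indirect calls; the program encoding and the helper names find_call_paths and violates are hypothetical. The function and privilege names mirror the example in Figure 9.

```python
def find_call_paths(program, entry, source, sink):
    """Return (call_path, PVS) for each sink call reached after a source call.

    program maps a function name to an ordered list of events, where each
    event is ("check", privilege) or ("call", callee). This encoding is an
    assumption made for the example.
    """
    results = []

    def walk(func, path, pvs, tracing):
        for kind, arg in program.get(func, []):
            if kind == "check" and tracing:
                pvs = pvs | {arg}              # privilege exercised on this path
            elif kind == "call":
                if arg == source:
                    tracing, pvs = True, frozenset()   # start a new trace
                elif arg == sink:
                    if tracing:
                        results.append((path + [arg], pvs))
                elif arg not in path:          # do not follow recursive calls
                    tracing, pvs = walk(arg, path + [arg], pvs, tracing)
        return tracing, pvs

    walk(entry, [entry], frozenset(), False)
    return results


def violates(pvs, upvs):
    # Potential violation iff PVS is not a subset of UPVS.
    return not pvs <= upvs


program = {
    "ButtonEvent": [("call", "TizenNativeAPI"), ("check", "PRV_1"),
                    ("call", "evaluate")],
    "evaluate": [("check", "PRV_2"), ("call", "BlueToothOp")],
    "BlueToothOp": [("call", "glibcCall")],
}

paths = find_call_paths(program, "ButtonEvent", "TizenNativeAPI", "glibcCall")
for call_path, pvs in paths:
    print(" -> ".join(call_path), sorted(pvs), violates(pvs, {"PRV_1"}))
```

Here the single call path exercises PRV_1 and PRV_2, so it is flagged when the rule's UPVS contains only PRV_1 and accepted when UPVS contains both privileges.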

Figure 9(a) shows an example where the code base
contains a user-specified source (TizenNativeAPI), sink
(glibcCall), and check privilege function (CheckUserPriv).
There is a call path ButtonEvent -> evaluate
-> BlueToothOp. Two user privileges are exercised
in this call path: PRV_1 and PRV_2. Figure 9(b) gives
the HSSA version of the code (only function-based uphi
nodes need to be considered in this analysis), and the
arrow lines show the progress of updating PVS in the
HSSA def-use traversal for the call path.

The output of TAA is a list of such call paths that
potentially violate user privilege properties. The output
includes the source Tizen API function, the sink Linux
kernel function, and the full call path from source to sink.
This analysis can lead to false positives and false negatives.
Since SIFA here runs on the API libraries, it can be
extended to model additional security issues, such as
unauthorized resource access.

    void ButtonEvent() {
        x = TizenNativeAPI();
        CheckUserPriv(PRV_1);
        evaluate(p);
    }

    void evaluate(Info* p) {
        CheckUserPriv(PRV_2);
        BlueToothOp();
    }

    void BlueToothOp() {
        glibcCall(m);
    }

    (a)

    void ButtonEvent() {
        x = TizenNativeAPI();     H1 = uphi(H0)     PVS = {}
        CheckUserPriv(PRV_1);     H2 = uphi(H1)
        evaluate(p);              H3 = uphi(H2)
    }

    void evaluate(Info* p) {
        CheckUserPriv(PRV_2);     H11 = uphi(H0)    PVS_in = {PRV_1}
        BlueToothOp();            H12 = uphi(H11)   PVS = {PRV_1, PRV_2}
    }

    void BlueToothOp() {
        glibcCall(m);             H21 = uphi(H0)
    }

    (b)

Figure 9: Example of Tizen API Analysis

5.1 Tizen API Analysis Evaluation

We wrote a rule set for Tizen API analysis based on the
Tizen security policies discussed above. Using our tool,
we were able to find several unexpected behaviors of the
Tizen APIs.

5.1.1 Tizen API Analysis Results

For the Tizen API analysis, we started with a simple rule
to test whether Tizen enforced privilege checks for the
privileged APIs. Our tool located a privileged API that
didn’t follow the API documentation [45]. This bug
allows applications to receive push notifications without
owning one of the two required privileges. While
this API requires _PRV_PUSH and _PRV_HTTP according
to the API documentation, it only checks for the
_PRV_PUSH privilege. Our tool detected this inconsistency.

Furthermore, we found two API calls that register
an application with the application
launcher so it can run when a specified condition is
met (comparable to an Android app’s ability to register
to receive a broadcast intent). One API call can register
any application while the other can only register its
caller. Our tool detected that the broader API call is
vulnerable in that it lacks a required privilege check
that the other API has. Thinking we had found a significant
vulnerability, we dug deeper and followed the
subsequent execution path manually. We ultimately discovered
that the app launcher itself, which receives these
calls, makes its own security checks. While this finding
could be interpreted as a false positive, the discrepancy
between security checks taking place on different levels
for the same mechanism is something that deserves manual
scrutiny. Our tool allowed us to focus our attention on
an API call that indeed appeared to have an exploitable
hole.

Our analysis also highlighted several InputMethod
APIs. None of InputMethod’s privileged APIs had
privilege checks, including the SendText API. Again, we
manually followed the calls and discovered that, unlike
other classes’ privilege checks, InputMethod enforces
privilege checks in GetInstance, when the application
retrieves an instance of InputMethod. In this respect,
InputMethod follows something of a capability style of
access control (i.e., if you hold a valid instance, then you
must be allowed to use it). So, while we again didn’t find
a vulnerability, we did find a coding style at odds with
the way the rest of the APIs do their security checks,
deserving of additional scrutiny.

Lastly, we wrote another rule that detects flows from
privileged APIs to non-privileged APIs. The intuition
behind this rule is that if a privileged API only uses
non-privileged APIs, the privilege check is unnecessary.
We found a privileged API, which deletes all cookies in
an application, whose behavior could be replicated using
only non-privileged APIs. While this doesn’t indicate a
security hole, it does validate that our tool is capable of
discovering both missing security checks and unnecessary
ones.
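
The intuition behind this last rule can be sketched as a reachability query over a call graph. The graph encoding and all function names below are hypothetical, and the sketch is deliberately coarse: it ignores privileged work performed by internal, non-API functions, so a real rule would need to be more careful.

```python
def reachable_apis(call_graph, start, api_names):
    """Collect every API in api_names that is transitively called from start."""
    seen, stack, hits = {start}, [start], set()
    while stack:
        func = stack.pop()
        for callee in call_graph.get(func, ()):
            if callee in api_names:
                hits.add(callee)
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return hits


def unnecessary_checks(call_graph, privileged, non_privileged):
    """Flag privileged APIs that rely only on non-privileged APIs."""
    api_names = privileged | non_privileged
    flagged = []
    for api in sorted(privileged):
        used = reachable_apis(call_graph, api, api_names)
        if used and used <= non_privileged:
            flagged.append(api)
    return flagged


# Hypothetical API surface and call graph.
call_graph = {
    "ClearAllCookies": ["ListCookies", "RemoveCookie"],
    "SendMessage": ["modem_write"],   # internal helper, not a public API
    "ListCookies": [],
    "RemoveCookie": [],
}
privileged = {"ClearAllCookies", "SendMessage"}
non_privileged = {"ListCookies", "RemoveCookie"}

print(unnecessary_checks(call_graph, privileged, non_privileged))
```

In this toy graph only ClearAllCookies is flagged, since everything it does could be reproduced by an app calling the two non-privileged APIs directly.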

Overall, while we’re modestly disappointed that we
didn’t find any security flaws, we note that a massive
codebase like Tizen, with a large stable of developers
contributing new code on a regular basis, creates logistical
challenges for the security analysts trying to keep
up with it. A tool like ours, running as part of a nightly
build system, allows an analyst to detect new flows and
control paths that might have innocently introduced
security vulnerabilities.

5.1.2 Tizen API Analysis Performance 

Our analysis ran on a quad-core Intel Core i7-3770
3.50GHz workstation with 8GB of memory, running Fedora
Linux and LLVM 3.3. The test bed is a part of the
Tizen platform consisting of 4,346 C/C++ files compiled
into LLVM bitcode files with a total size of 560MB.
Generating the call paths for all APIs
took 122.5 seconds with memory usage under 8GB. This is
fast enough that it could reasonably run not only as part
of a nightly build process but as part of a regular developer’s
source code commit process, flagging new flows
before the change hits the code repository.


6 Pragmatic Issues 

Our static analysis tool leverages the LLVM analysis
infrastructure and so depends on the use of the LLVM
compiler. For the Tizen native application analysis,
LLVM/Clang is the default compiler. However, the Tizen
platform code is compiled using GCC. To compile the Tizen
platform code with LLVM, we had to address issues that
other large-scale static analysis tools, such as Coverity,
also had to address when processing real-world software:
the issues raised by standards, language dialects,
and compiler variations. In short, to use the LLVM
infrastructure, we had to make two changes to the Tizen
source distribution.

First, we needed to change the compiler from GCC to
Clang, which generates the LLVM bitcode that is input to
the LLVM analysis infrastructure. Since GCC and Clang
are not completely compatible, this step involved
manual inspection of each module. We edited each build
file and made source code changes as needed to remove
errors. Changes, in some cases, included editing of
assembly code.

Second, Tizen uses a variety of different build systems
(CMake, libtool, and traditional makefiles). Consequently,
each module is a new adventure in software
porting, both in terms of the initial compilation step and
in terms of linking.

As a result, we had to decide when we had enough
coverage to validate our tool and approach. The Tizen
source is divided into different source packages and we
successfully compiled 159 out of 390 Tizen framework
packages to LLVM bitcode, generating more than 4,000
LLVM bitcode files with a total size of 560MB. We compiled
all the packages from the top two layers: OSP and
the CAPI layer, which handles the native application. We
picked underlying components’ packages that were directly
relevant to the privileged APIs, such as telephone,
messaging, and system. We did not compile packages
that were not relevant to the privileged APIs, such as
graphics, UI, and multimedia.

A full analysis, of course, would need to push the entirety
of the Tizen codebase through LLVM, and this effort
would need to be replicated each and every time
the analysis was to be conducted. If our vision of our
tool being closely integrated in the Tizen build environment
were to ever take off, Tizen would realistically need
to switch to LLVM as its production compiler. With
LLVM in production use by a number of very prominent
projects, including Apple’s iOS / OS X, this isn’t an
unreasonable recommendation.

7 Related Work


7.1 Static analysis of production code 

Static analysis has been proven to be successful in finding
bugs in real-world programs. Coverity and
Fortify are well-known commercial static analysis
tools. An article by Bessey et al. discusses a number
of pragmatic issues and experiences with respect to
static analysis tools for finding bugs in large commercial
code bases (up to 20-30 MLOC). They observe that
"the false positive rate is simplistic since false positives
are not all equal and initial reports matter inordinately".
Both Fortify [30] and Coverity emphasize results
prioritization once vulnerabilities are identified. Our ranking
and cutoff analysis also addresses this issue.
We discuss related ranking work in Section 7.3.


IBM AppScan Source is a tool meant to identify
bugs during the development phase for web applications.
Other editions of IBM AppScan identify general
bugs while focusing on security problems in particular
and supporting customizable rules.

FindBugs, a static analysis tool used on Google
code bases, focuses more on identifying common Java
programming bugs than on security vulnerabilities in
particular. The importance of the tool’s UI with respect
to the speed of understanding and fixing bugs has been
demonstrated (analysts processed bugs in FindBugs
faster than with Fortify). The tool was used to show that
bugs found in older code bases are less likely to be fixed
once discovered.

ESC/Java is a static analysis tool for Java, powered
by verification-condition generation and automatic
theorem-proving techniques, that checks for
common programming errors. While it does find errors,
users have to annotate the software and the annotation
burden is quite high. It also suffers from excessive spurious
warnings on programs that are annotated.

Metal [29] is a language for programmer-written compiler
extensions that express a broad range of correctness
rules that code must obey. The system xgcc executes
these extensions using a context-sensitive interprocedural
analysis. Metal is designed for system programmers
with an emphasis on ease of use, and makes use of state
machines as a fundamental abstraction. This approach
has been used to find thousands of bugs in real systems
code.


7.2 Security analysis of mobile applications

Privilege escalation attacks on mobile applications are
known to the community. In particular, the vulnerability
of Android applications is well known. Android,
like Tizen, is a permissions-based mobile operating system,
so analysis of possible permission leak vulnerabilities
is also needed for it.

SCanDroid was the first static analysis tool for
Android to detect information flow violations. The tool
detects inter-application security risks and needs to have
access to both the vulnerable application and the
exploitable application. To the best of our knowledge,
SCanDroid is not easily extensible with new taint propagation
rules, unlike SIFA, which is designed from the
ground up to support custom rules.

FlowDroid is a static taint-analysis tool for Android
applications, based on the Heros IFDS/IDE solver
and the Soot Java analysis framework. It models the
Android application life cycle, including multiple entry
points, asynchronously executing components, and callbacks.
It performs context-, flow-, field-, and object-sensitive
analyses to discover vulnerabilities in applications.
FlowDroid has excellent performance because it
performs on-demand alias analysis, but as described in
the FlowDroid paper it does not handle implicit flows through
control dependences.

Grace et al. focus on static analysis of stock Android
firmware and identify confused deputy attacks that
enable the use of permission-protected capabilities. Our
application analysis is complementary in that it identifies
not only actions that are performed, but information that
flows to attackers. Our focus is not on stock applications,
but on third-party applications.

CHEX relies on taint analysis to discover permission
leaks in Android applications. CHEX detects several
types of vulnerabilities affecting Android applications,
including permission-protected information leaks.
The CHEX analysis is similar to our application analysis,
but relies on a model of the OS libraries rather
than analyzing them directly. This avoids handling the
multi-language analysis difficulties that Tizen and Android
have.

TaintDroid uses dynamic taint tracking to identify
protected information flows that reach Android network
communication APIs (sinks). Advantages of performing
this analysis dynamically are increased precision, as well
as enforcement of the safe use of vulnerable applications
by denying users the capability to externalize their sensitive
information during application use. The advantage
of static analysis tools such as SIFA is their capability of
detecting vulnerable applications before they even reach
the user.

ComDroid is a tool that analyzes inter-application
communication in Android. ComDroid does not track
permission-leak vulnerabilities and none of the discovered
vulnerabilities described pertain to permission-protected
information. Contributions such as automatic
rule generation separate our work from theirs.

Kirin is a tool based on a formal representation
of the Android security model that checks if applications
meet security policies. It can check for confused
deputy vulnerabilities (“unchecked interface”), intent
spoofing (“intent origin”), and other attacks by using
a powerful Prolog-based security policy enforcement
mechanism, which takes into consideration the set of
applications already installed on a device. The authors
point out several difficulties with creating information
flow policies in Android and discuss the future possibility
of including source code analysis to make information
flow policies for Android of practical use.

Felt et al. [22] map Android API calls to permissions
based on automated testing rather than static analysis,
which means incomplete coverage and the possibility of
false negatives in the permissions map. They do not use
the map to check for information flow-based vulnerabilities
in applications. PScout builds a permission map
for Android through static analysis based on Soot.

A different aspect of flow vulnerabilities is described
by Marforio et al., whose work focuses
on colluding applications. They identify several
possible covert channels through which malevolent applications
can communicate sensitive information, for
example by enumerating processes using native code or
files. Most of these, however, are not Android specific.
They did not build a tool to detect flow vulnerabilities.
They identify security risks for colluding applications in
modern permission-based operating systems.

The PermissionFlow tool performs a static
dataflow analysis to identify sources of information protected
by permissions in Android and a taint analysis to
check if this information reaches other applications or
leaks outside the device. The source APIs are, in contrast
to the work of Felt et al., obtained through static analysis.
PermissionFlow does not offer any support for implicit
flows. Bartel et al. propose a similar taint analysis, and
both were able to find vulnerabilities in commercial applications,
highlighting the importance of performing a
corresponding analysis on Tizen applications, which
is what our tool does.

7.3 Results Ranking 

To the best of our knowledge we are the first to use error
ranking and cutoff as means to reduce the false positive
rate in security analysis, but there is a long history
of using these techniques in static analysis tools. We
use error ranking both to suppress false positives and to
prominently display errors that are considered to be of
importance to the user. The goal of report prioritization
is to display errors according to their importance to the
user. An article by Bessey et al. observes that the
most prominently displayed reports are critical and have
a strong impact on the user’s perception of the quality of
the tool.

An early tool to use error ranking for results of a
static analysis is Prefix, which focuses on analysis
of memory allocation and usage errors in C. It was an
essential tool for improving the reliability of the Windows
OS. Because of the high volume of warnings generated,
Prefix uses a set of ad hoc filters to improve the
relevance of the warnings displayed.

Several tools, such as Z-Ranking, Feedback-ranking,
and Airac, propose the idea of using
statistical modeling to obtain better ranking of positives.
Kremenek observed that bugs often cluster by code
locality and attributes this characteristic to the observation
that programmers who do violate a particular programming
rule tend to violate it multiple times. Code
locality plays a role in our ranking function too, but the
correlation is instead between the confidence in a result
being a true positive and the code span of the taint.

FindBugs performs report prioritization by combining
several factors such as confidence in a result and
the seriousness of the bug. In our system, report prioritization
is accomplished by sorting security violations
according to severity, which is provided as user-specified
values in the input taint rules. EspX classifies bugs into
different buckets based on both its confidence in the error
being a true positive and on the severity of the bug. Fixing
all bugs in designated buckets was a requirement to
integrate code into the Windows OS.

7.4 Implicit Flow 

In this section, we compare SIFA with other work in
the area of analysis of implicit flows. The implicit
flows considered here only include control flows and not
covert channels, which in general cannot be secured with
software-only approaches. The importance and difficulty
of handling implicit flow is presented in numerous studies.
The detection of implicit flows by either
static or dynamic analysis has proven to be challenging.
We have developed a static taint analysis that unifies implicit
and explicit information flows in a single analysis
mechanism.

Liu and Milanova develop a context-sensitive interprocedural
static information flow inference analysis
which performs security type inference. A security type
system requires the annotation of variables and statements
with security types, which are labels that denote
security levels [42]. They handle both explicit and implicit
flows. Their method captures control dependences
through adding implicit flow edges and paths, some of
which are annotated by the analysis. They perform a
static taint analysis based on this representation.

Genaim and Spoto [26] present an information flow
analysis for both explicit and implicit flows for full
(mono-threaded) Java bytecode. They build a control
flow graph that represents the complex control features
of Java bytecode. For efficiency, they represent information
flows through Boolean functions. They treat fields
as static (i.e., global) class variables, and so do not distinguish
flow between the same field of multiple objects
of a given class, a significant source of imprecision. In
contrast, we model objects using Heap SSA form, and
so distinguish between the instances of the same field in
different objects.

While taint analysis is effective for detecting a wide
range of attacks on benign software, Cavallaro et al.
show that it is not as effective for detecting attacks due
to malicious software. In particular, they present simple
and powerful evasion techniques, used in untrusted
x86 binaries, that elude static and dynamic taint-tracking
techniques. They report that enhancing taint analysis to
reason about control dependences, as our method does,
improves evasion resistance but results in a high rate of
false positives. This could limit the usefulness of such
techniques, given the wide use of binary-based software
distribution and employment models. This difficulty motivates
the use of trusted LLVM bitcode as a distribution
format.

King et al. experimentally investigate the value
of tracking implicit flows through the security-typed language
JLift, an extension of Jif. They find that implicit
flow checking can be valuable, in terms of identifying
true leaks of secret information, but produces a high
(83%) rate of false positives (over-tainting), in particular
due to unchecked exceptions.

It has been shown that purely dynamic techniques cannot
detect certain implicit flows [58], so the application
of dynamic taint analysis to implicit flows results
in false negatives. However, to mitigate the issue of static
over-tainting, dynamic taint analysis has been applied to
implicit flows through techniques that selectively propagate
taints along a targeted subset of control dependences.
Our use of ranking and cutoff can
also be used to mitigate over-tainting.

8 Conclusions and Future Work 

Analyzing the security of a large software platform like
Tizen presents a valuable opportunity to apply
state-of-the-art tools in static analysis. Static analysis can
be usefully applied to identify undesirable behaviors
in apps distributed through app stores, and it can help
the system’s developers find needle-in-a-haystack bugs
throughout their system. While we had a limited library
of apps to consider, we were able to achieve very
good analysis performance and were able to identify
non-trivial information flows that could be dangerous in
untrusted apps. Similarly, by processing a substantial fraction
of the Tizen codebase, we were able to identify
a handful of locations where important security checks
were missing; subsequent manual analysis determined
that later software layers made checks that prevented
these initial mistakes from becoming exploitable.

Our work additionally demonstrates the value of a 
general-purpose infrastructure like LLVM. While this 
project focused on C and C++ code, our analyses could
potentially run on any programming language for which 
there’s an LLVM front-end. For example, a JavaScript 
front-end for LLVM would allow our tools to analyze 
“web apps” in addition to “native apps” with identical 
information flow rules. 

Furthermore, the extensions we made to LLVM, such
as our class hierarchy analysis, are general-purpose and
could well be folded back into the LLVM distribution.
(We intend to make an open source release of our
extensions.) We hypothesize that the increased precision
of our analyses will enable dynamic dispatches to be
replaced with static function calls, as well as allowing
for better function inlining and other performance benefits.
Evaluating this performance impact represents future
work.

Now that Samsung has shipped its first Tizen products 
and real apps are starting to appear in its online app store, 
we expect that independent security analysts will be able 
to download these apps, in bulk, and analyze them as 
many security analysts have already done for Android 
and iOS. The Tizen platform is still in its early days as 
a consumer product, creating opportunities for the platform’s
security features to get ahead of attackers.